Tri Dao

CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs

May 20, 2026
Via arXiv

Search Your Block Floating Point Scales!

May 12, 2026
Via arXiv

SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving

Apr 21, 2026
Via arXiv

Introspective Diffusion Language Models

Apr 13, 2026
Via arXiv

Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution

Apr 09, 2026
Via arXiv

Mamba-3: Improved Sequence Modeling using State Space Principles

Mar 16, 2026
Via arXiv

M$^2$RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling

Mar 15, 2026
Via arXiv

AI+HW 2035: Shaping the Next Decade

Mar 05, 2026
Via arXiv

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

Mar 05, 2026
Via arXiv

Speculative Speculative Decoding

Mar 03, 2026
Via arXiv